Allow custom OptimizerHints #2216

Open
altmannmarcelo wants to merge 1 commit into apache:main from altmannmarcelo:extend_optimizer_hints_upstream

@altmannmarcelo

Two orthogonal improvements to optimizer hint parsing:

  1. Option<OptimizerHint> -> Vec<OptimizerHint>: the old Option silently dropped all but the first hint-style comment. Vec preserves all hint comments the parser encounters, letting consumers decide which to use. The field is renamed, so call sites need a mechanical update: optimizer_hint: None becomes optimizer_hints: vec![], and optimizer_hint.unwrap() becomes optimizer_hints[0].

  2. Generic prefix extraction: the /*+...*/ pattern is an established convention. Various systems extend it with /*prefix+...*/ where the prefix is opaque alphanumeric text before +. Rather than adding a new dialect flag or struct for each system, the parser now captures any [a-zA-Z0-9]* run before + as a prefix field. Standard hints have prefix: "". No new dialect surface -- same supports_comment_optimizer_hint() gate. This makes OptimizerHint a generic extension point: downstream consumers can define their own prefixed hint conventions and filter hints by prefix, without requiring any changes to the parser or dialect configuration.
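To make the migration in item 1 concrete, here is a minimal sketch. The `OptimizerHint` struct below is a local stand-in with only the `prefix` and `text` fields named in this description (the real sqlparser-rs type may differ), and `first_hint` is a hypothetical helper, not part of the crate.

```rust
// Local stand-in for the AST node described above. Only `prefix` and `text`
// come from the PR description; the real type may carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct OptimizerHint {
    /// Alphanumeric run before `+`; empty for a standard `/*+ ... */` hint.
    pub prefix: String,
    /// Everything after the `+`.
    pub text: String,
}

/// Hypothetical helper: recovers the old `Option`-style view of the hints.
/// Code that used `optimizer_hint.unwrap()` can index `optimizer_hints[0]`;
/// `first()` is the non-panicking equivalent when the list may be empty.
pub fn first_hint(hints: &[OptimizerHint]) -> Option<&OptimizerHint> {
    hints.first()
}
```

Indexing with `[0]` panics on an empty list just as `unwrap()` panicked on `None`, so callers that previously matched on the `Option` would probably prefer `first()`.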

@altmannmarcelo altmannmarcelo force-pushed the extend_optimizer_hints_upstream branch from 0a5df55 to 4e2c3ac Compare February 12, 2026 23:31

@xitep xitep left a comment

hello @altmannmarcelo,

i quite like your generalisation if it makes the feature useful for dialects supporting multiple hints! 👍 so i'm in for the change.

however, i wish you would introduce dialect flags to guide the parser to 1) allow the prefixes, and 2) allow accepting multiple hints (the AST can nicely present that with your suggestion of Vec<OptimizerHint>.)

yes, you wrote that downstream programs can do the validation themselves. and indeed they can. but, when you start writing your 3rd sql processor based on sqlparser, it becomes tiresome to repeat those validations in all of them (or to start maintaining a separate crate for these validations.) the great thing about sqlparser is that it has the concept of "dialects" and provides a common AST for all of them, yet is able to distinguish between the dialects' "idiosyncrasies." (having said that, i'm no authority and don't have a say in how sqlparser-rs wants to evolve.)

Some((before_plus.to_string(), text.to_string()))
} else {
None
}

@xitep xitep Feb 13, 2026

i think using str::split_once would make this shorter and leaner (possibly slightly more efficient), e.g. https://gist.github.com/rust-play/146d81960095525d6384f34d84ac7419
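For readers without access to the gist, a rough sketch of what a `str::split_once`-based version could look like follows. It is not the gist's content nor the PR's code: `split_hint` is a hypothetical name, and the input is assumed to be the comment body with the surrounding `/*` and `*/` already stripped.

```rust
/// Hypothetical helper: splits a block-comment body into `(prefix, text)`
/// when it has the shape `prefix+text`, where `prefix` is a possibly empty
/// run of ASCII alphanumerics. Returns `None` for ordinary comments.
pub fn split_hint(comment_body: &str) -> Option<(String, String)> {
    // `split_once` splits at the first `+` and yields `None` if there is none.
    let (before_plus, text) = comment_body.split_once('+')?;
    // An empty prefix passes this check, which covers the standard `/*+ ... */` form.
    if before_plus.chars().all(|c| c.is_ascii_alphanumeric()) {
        Some((before_plus.to_string(), text.to_string()))
    } else {
        None
    }
}
```

Because the split happens at the first `+`, any later `+` characters stay in the hint text, and a comment with whitespace before the `+` is treated as a plain comment.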

assert_eq!(select.optimizer_hints.len(), 2);
assert_eq!(select.optimizer_hints[0].text, "one two three");
assert_eq!(select.optimizer_hints[0].prefix, "");
assert_eq!(select.optimizer_hints[1].text, "not a hint!");

well, this test was to assert that "not a hint!" is in fact "not an optimizer hint!" :)

@altmannmarcelo

Thanks for the review @xitep, and glad you like the generalization!

I understand the appeal of dialect flags: they're great when the parser can definitively say "this syntax is valid/invalid for dialect X." However, I think optimizer hints are a different category. The /*+ ... */ pattern is an established convention precisely because it's a standard way to extend functionality without requiring grammar changes. Different systems (and even different versions of the same system) recognize different hint keywords and prefixes, and the parser shouldn't need to know about all of them.

The current approach keeps the parser's responsibility focused: if the dialect supports optimizer hints (supports_comment_optimizer_hint()), collect them all and let the downstream consumer decide which ones are relevant. This is intentional: adding dialect flags for which prefixes are allowed or how many hints are accepted would couple the parser to specific hint conventions that are really consumer-level concerns.

For downstream projects, the filtering is straightforward: match on prefix and ignore what you don't recognize. This is simpler and more flexible than maintaining dialect flags that would need updating as hint conventions evolve. It also means new prefix conventions can be adopted by consumers without any changes to sqlparser-rs.
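As an illustration of "match on prefix and ignore what you don't recognize", a downstream consumer might do something like the sketch below. Everything here is hypothetical: the `OptimizerHint` struct is a local stand-in with the two fields discussed in this PR, and the `route` prefix and `HintAction` enum are invented for the example.

```rust
// Local stand-in for the AST node; the real sqlparser-rs type may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct OptimizerHint {
    pub prefix: String,
    pub text: String,
}

/// Hypothetical consumer-side interpretation of a hint.
#[derive(Debug, PartialEq)]
pub enum HintAction<'a> {
    /// A standard `/*+ ... */` hint (empty prefix).
    Standard(&'a str),
    /// A hint using the made-up `route` prefix, e.g. `/*route+ replica */`.
    Routing(&'a str),
}

/// Keeps the hints this consumer understands and silently skips the rest.
pub fn classify(hints: &[OptimizerHint]) -> Vec<HintAction<'_>> {
    hints
        .iter()
        .filter_map(|h| match h.prefix.as_str() {
            "" => Some(HintAction::Standard(h.text.trim())),
            "route" => Some(HintAction::Routing(h.text.trim())),
            _ => None, // unknown prefix: not ours, ignore it
        })
        .collect()
}
```

A consumer that wants stricter behavior, such as rejecting unknown prefixes or more than one hint, could return a `Result` from the same `match` instead of dropping entries.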

@xitep

xitep commented Feb 13, 2026

i agree it will make the maintenance of sqlparser simpler by pushing the responsibilities downstream. and yes, in this case, with the optimizer hints, it is straightforward.
